ERRMON 1.1
By Robert J. Newton
After experiencing a variety of problems with the hard disk on
my AT, I decided that I wanted a more informative report of disk
I/O errors than that provided by DOS. ERRMON is the result and
it did give me advance warning of total collapse of the infamous
CMI drive.
ERRMON is a resident program which inserts itself in the INT 13
chain. It then sits quietly watching for any error condition
returned by the disk driver. When an error is detected, it
springs to life and prints an error message on the screen. This
message is usually more informative than that provided by DOS.
The failing command and location of the offending area of the
disk will also be displayed. This information will appear as:
     error message xxDxxCxxxxHxxSxxNxx
where the first two digits are the command sent to the BIOS
driver and D, C, H and S are the Drive and the starting Cylinder,
Head and Sector of the INT 13 request and N is the number of
sectors requested. The values are in hex; floppy drives will be
numbered 00, 01; hard drives will be numbered 80, 81. Note that
if a multi-sector request was made (N>1) then the values for C, H
and S will indicate the starting sector, not the actual sector
causing the error. The actual location of the error could be
found only by reading the controller registers, a very
hardware-specific operation. Also note that some of this information may
be meaningless for certain INT 13 operations. After reporting,
ERRMON gracefully returns to the caller to let it do what it
wishes with the error.
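
The location code is simple to pick apart by program, for example when keeping a log of errors. The following is only a sketch in Python and is not part of ERRMON. The names parse_location and ErrorLocation are invented for illustration, the sample line is made up, and the sketch assumes the code is the last word on the line with exactly the field widths shown above.

```python
import re
from typing import NamedTuple

# Fixed-width hex fields in the order described above:
# command, D(rive), C(ylinder), H(ead), S(ector), N(umber of sectors).
_LOCATION = re.compile(
    r"([0-9A-F]{2})D([0-9A-F]{2})C([0-9A-F]{4})"
    r"H([0-9A-F]{2})S([0-9A-F]{2})N([0-9A-F]{2})",
    re.IGNORECASE,
)


class ErrorLocation(NamedTuple):
    command: int   # INT 13 function number
    drive: int     # 00, 01 = floppies; 80, 81 = hard disks
    cylinder: int  # starting cylinder of the request
    head: int      # starting head
    sector: int    # starting sector
    count: int     # number of sectors requested

    @property
    def is_hard_disk(self) -> bool:
        return self.drive >= 0x80

    @property
    def start_is_failing_sector(self) -> bool:
        # Only a single-sector request pins down the failing sector.
        return self.count <= 1


def parse_location(line: str) -> ErrorLocation:
    """Decode the location code assumed to be the last word of the line."""
    words = line.split()
    match = _LOCATION.fullmatch(words[-1]) if words else None
    if match is None:
        raise ValueError(f"no location code at end of line: {line!r}")
    return ErrorLocation(*(int(field, 16) for field in match.groups()))


# Hypothetical report line, for illustration only.
loc = parse_location("Sector not found 02D80C0132H03S0BN04")
print(loc, loc.is_hard_disk, loc.start_is_failing_sector)
```

Because the request in the sample line covers four sectors, start_is_failing_sector is false: the reported C, H and S give only the start of the request.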
ERRMON will let you know when DOS makes a first retry. However,
ERRMON does not scroll the screen. This means that it is
possible that a second message will overlay the first and you
would not be aware of repeated retries.
A word of caution. ERRMON has been written to respond to the
error codes returned by the IBM BIOS disk drivers on the PC, XT
and AT as shown in the various BIOS listings. Results with
other drivers are totally unknown although it might be assumed
that they map their error codes the same in order to achieve
compatibility. There are no checks for machine or drive type.
In addition, the errors defined for the PC/XT and AT fixed disk
drivers are not the same in all cases. ERRMON responds to the
codes for both. It is possible, but not likely, that the AT,
say, could through some glitch return an error code that is not
defined on the AT, but is on the XT. Instead of displaying an
"Undefined error" message, ERRMON would display the message
defined for the XT. This possibility was considered so remote
that it was not trapped.
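
The effect of one combined table can be sketched in a few lines of Python. This is an illustration only, not ERRMON's actual code. The hex values are the INT 13 status codes commonly documented for the IBM PC/XT and AT BIOS; pairing each one with the message names described under Error messages is an assumption, as are the names MESSAGES and describe.

```python
# One table for all machines: no check is made for PC/XT versus AT.
# ASSUMPTION: the code-to-message pairing below is inferred, not taken
# from the ERRMON source.
MESSAGES = {
    0xFF: "Sense failure",
    0xE0: "Status error",
    0xCC: "Write fault",
    0xBB: "Undefined error",
    0xAA: "Drive not ready",
    0x80: "No response",
    0x40: "Seek failure",
    0x20: "Controller failure",
    0x11: "EEC corrected error",
    0x10: "Bad CRC/EEC on read",
    0x0B: "Bad track",
    0x0A: "Bad sector",
    0x09: "DMA boundary crossed",
    0x08: "DMA overrun",
    0x07: "Drive init failure",
    0x05: "Drive reset failure",
    0x04: "Sector not found",
    0x03: "Disk write protected",
    0x02: "DAM not found",
    0x01: "Bad drive or command",
}


def describe(status: int) -> str:
    """Return the message for a status code, whichever machine defines it."""
    return MESSAGES.get(status, "Undefined error")


print(describe(0x0B))  # a PC/XT-only code is still looked up on any machine
print(describe(0x77))  # a code in neither listing falls through
```

With a single table, a PC/XT-only code such as Bad track would be reported by name even on an AT, which is the untrapped case described above.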
The only error that ERRMON does not respond to is applicable
only to the AT. Its floppy disk driver may report an error that
is related to the dual nature of the AT's high capacity drives.
This is not a true I/O error and is ignored by ERRMON.
Many copy protected programs will cause ERRMON to display an
error message but the program will run normally. The program is
deliberately causing the error as a part of its check for an
original distribution disk. Other programs such as FORMAT,
DISKCOPY and DISKCOMP may report errors as a part of their
determining the type of disk with which they are dealing.
It has been reported that certain Teac drives used with certain
drivers will return one or two error reports on any read attempt
of a drive that is not already spinning. This is believed to be
due to the mechanical design of the Teac drive. To quote from
the IBM BIOS listings, "On read accesses, no motor start delay is
taken, so that three retries are required on reads to ensure that
the problem is not due to motor start-up". This means that DOS
must make at least one retry on every read access of these drives
unless the driver software has been written to make the retries,
a fact that has been hidden from you. It has also been reported
that the design of these drives causes them to have problems with
some copy protection schemes.
ERRMON's overhead is essentially nil: five instructions when
there is no error. Approximately 1K of memory will be used,
depending on how much environment space is in use when it
is loaded.
The video attribute for the error messages has been set to 0F,
intense white. The messages will print in the lower right
corner of the screen. You may change both of these by using
DEBUG. The attribute byte is at offset 0332 and the screen line
is at offset 02E3. Note that the screen line is relative to 0
so that the last line is hex 18, decimal 24. For those with
MASM, you may change the EQUATEs at the beginning. You may also
change the message text. If you change the message text such
that the entire line would exceed the presently set width ERRMON
will truncate the line to the screen width.
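
For anyone choosing a new attribute byte, the following Python sketch decodes one into its parts. It assumes the standard colour text-mode layout (bits 0-2 foreground, bit 3 intensity, bits 4-6 background, bit 7 blink); monochrome adapters interpret some values differently. The names COLORS, decode_attribute and LAST_ROW are invented for illustration and do not come from ERRMON.

```python
# Colour names for the 3-bit colour fields of a text-mode attribute byte.
COLORS = ["black", "blue", "green", "cyan",
          "red", "magenta", "brown", "white"]


def decode_attribute(attr: int) -> dict:
    """Split a text-mode attribute byte into its four fields."""
    return {
        "foreground": COLORS[attr & 0x07],
        "intense": bool(attr & 0x08),
        "background": COLORS[(attr >> 4) & 0x07],
        "blink": bool(attr & 0x80),
    }


# The default described above: intense white on black, no blink.
print(decode_attribute(0x0F))

# Screen lines are counted from 0, so a 25-line screen ends at hex 18.
LAST_ROW = 0x18
print(LAST_ROW)
```

Under the same assumed layout, hex 4E would give intense brown (yellow) on a red background.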
Installation is simple: just type errmon. A successful
installation will return ERRORLEVEL 0; an unsuccessful one will
return 255. ERRMON can be installed from AUTOEXEC.BAT, preferably
before anything else that might place itself in the INT 13 chain,
so that it receives the register values returned by the disk
device driver. You can redirect output to nul (errmon >nul) to
suppress the sign-on message.
Error messages
Note that some of these errors are applicable to both floppy and
hard disks; others are applicable only to hard disks. These
descriptions are by no means complete.
Sense failure (PC/XT only)
Status error (AT only) - The controller status register returned
an error condition, but the error register did not contain an
error code.
Write fault (AT only) - Indicates a hardware problem with the
drive.
Undefined error - An error code was returned for which BIOS has
not defined an error.
Drive not ready (AT only)
No response - The time allotted for an operation expired without
a response from the drive; this is what DOS calls Drive not ready.
Seek failure - An attempt to seek to the requested cylinder was
unsuccessful. Assuming the cylinder number was valid for the
disk, this probably indicates a hardware problem.
Controller failure - Probably indicates that the controller could
not successfully complete the requested command within the
allotted time.
EEC corrected error - An informational report that the
controller's ECC (error-correcting code) algorithm successfully
corrected a soft data error. If this happens frequently to one
file, you should copy the file and delete the original.
Bad CRC/EEC on read - The sector could not be successfully read.
This is the most common of all disk I/O errors, probably
outnumbering all others combined, and can have many causes:
defective media, dirty heads, misaligned drives, electrical
glitches ... The controller itself may make several attempts to
read the sector before giving up with an error report and DOS
will make five retries before it gives up and finally (usually,
but not always) lets you know that a problem exists. You may
still be able to recover the sector by answering the prompt with
R for retry a few times. If you can read the sector, the file
should be copied and the original deleted, otherwise resort to
RECOVER. The next write to the sector may "cure" the problem,
at least temporarily, unless there is physical damage to the
media.
My experiences with the AT CMI drive show that the classic
failure is associated with bad CRC/ECC sectors which appear
rather suddenly in large numbers, possibly after a few days of
occasional errors which are overcome by retries. The flawed
sectors can appear even in areas of the disk which have not been
written to except by the low level formatter. Further, they
shift in location and number. If this happens, it is best
not to delay, hoping that a low level reformat with the Advanced
Diagnostics or magic incantations will avoid future problems.
Sooner or later it will happen again. You may as well have the
drive replaced quickly even though the diagnostics report no
errors.
Bad track (PC/XT only) - An operation was attempted on a track
flagged bad by the low level formatter. Such areas should have
been marked bad in the FAT.
Bad sector (AT only) - Similar to Bad track.
DMA boundary crossed - This indicates a software problem: a
DMA transfer cannot cross a 64K physical address boundary.
DMA overrun - The controller could not get DMA access; a retry
should succeed.
Drive init failure - Probably indicates an attempt to initialize
an invalid drive number.
Drive reset failure - An attempt to reset the drive system
failed.
Sector not found - The requested sector number could not be
found. Assuming the values were valid, this indicates a flawed
disk. Absent a very capable disk zapper and the knowledge of its
use, the only cure is a low level format. It is also possible
that a disk made in one drive cannot be read in another because
one of them is out of alignment.
Disk write protected - Take the tab off. If there is no tab
the drive's sensors are probably bad.
DAM not found - The Data Address Mark recorded before the data
area of each sector by the low level formatter could not be
found. This usually indicates a disk which has suffered
electrical or physical damage; see the comments under Sector not
found. This error also will be reported when trying to access an
unformatted disk or possibly one with a foreign format.
Bad drive or command - An invalid drive number or command was
sent to the BIOS disk driver.
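
As a footnote to the DMA boundary crossed entry, a program can test a buffer before issuing the request. The Python sketch below shows the arithmetic only, assuming real-mode segment:offset addressing and a transfer length in bytes. The function names are invented for illustration and nothing here comes from ERRMON.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode physical address: segment * 16 + offset."""
    return (segment << 4) + offset


def crosses_dma_boundary(segment: int, offset: int, length: int) -> bool:
    """True if the buffer's first and last bytes lie in different 64K pages."""
    if length <= 0:
        raise ValueError("length must be at least one byte")
    start = physical_address(segment, offset)
    end = start + length - 1
    return (start >> 16) != (end >> 16)


# One 512-byte sector starting 16 bytes below a 64K page boundary.
print(crosses_dma_boundary(0x1FF0, 0x00F0, 512))

# The same sector placed exactly on the page boundary.
print(crosses_dma_boundary(0x2000, 0x0000, 512))
```

The first buffer starts at physical address 1FFF0 and ends at 201EF, so it straddles the boundary at 20000; the second fits entirely within one 64K page.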
Version 1.1 adds a report of the command and disk area causing
the error.
ERRMON is (C) Copyright 1985 by Robert J. Newton, but hereby
released to the public domain for private non-commercial use. It
may be freely copied and distributed, but no consideration may be
requested other than any customary handling fees charged by
recognized users' groups. No warranties of any kind are provided
and by using the program the user assumes all risk.